Adaptive block transform image coding method and apparatus

ABSTRACT

In a method and apparatus for transmitting a digital image over a limited bandwidth communication channel, an image is block transformed to produce blocks of transform coefficients; the transform coefficients are quantized in accordance with a model of the visibility of quantization error in the presence of image detail; the quantized coefficients are encoded with a minimum redundancy code; and the coded, quantized transform coefficients are transmitted.

RELATED APPLICATIONS

U.S. Ser. No. 057,066; filed June 2, 1987

U.S. Ser. No. 057,410; filed June 2, 1987

U.S. Ser. No. 057,413; filed June 2, 1987

U.S. Ser. No. 057,585; filed June 2, 1987

U.S. Ser. No. 057,595; filed June 2, 1987

U.S. Ser. No. 057,596; filed June 2, 1987

TECHNICAL FIELD

The present invention relates to block transform digital image compression and transmission methods and apparatus, and more particularly to such methods and apparatus exploiting characteristics of the human visual system for increased image compression.

BACKGROUND ART

It is well known to employ block transform coding of digital images for bandwidth compression prior to transmission over a limited bandwidth communication channel. In a typical prior art digital image compression and transmission system employing block transform coding (see U.S. Pat. No. 4,302,775 issued Nov. 24, 1981 to Widergren et al.), the digital image is formatted into blocks (e.g. 16×16 pixels) and a spatial frequency transformation such as a discrete cosine transform (DCT) is applied to each block to generate 16×16 blocks of transform coefficients. Each block of transform coefficients is ordered into a one-dimensional vector such that the frequencies represented by the coefficients generally increase along the vector. The transform coefficients are quantized and coded using a minimum redundancy coding scheme such as Huffman coding, and run length coding for runs of coefficients having zero magnitude. The coded transform coefficients are transmitted over the limited bandwidth channel.

At the receiver, the image signal is decoded using operations that are the inverse of those employed to encode the digital image. This technique is capable of producing advantageously high image compression ratios, thereby enabling low bit rate transmission of digital images over limited bandwidth communication channels.

It has been suggested that further improvements in image quality, without increasing the low bit rates, or alternatively even lower bit rates with the same quality of image, may be achieved by weighting the quantization of the transformed coefficients in accordance with the sensitivity of the human visual system to spatial frequencies (see "A Visual Weighted Cosine Transform for Image Compression and Quality Assessment" by N. B. Nill, IEEE Transactions on Communications, Vol. COM-33, pg. 551-557).

Block adaptive transform coding schemes have been proposed wherein transform blocks are sorted into classes by the level of image activity present in the blocks. Within each activity level, coding bits are allocated to individual transform coefficients with more bits being assigned to "busy" areas of the image and fewer bits assigned to "quiet" areas. (See "Adaptive Coding of Monochrome and Color Images" by W. H. Chen and C. H. Smith, IEEE Transactions on Communications, Vol. COM-25, No. 11, November 1977, pg 1285-1292). Although such block adaptive coding schemes achieve low overall bit rates, with low image distortion (in the sense of mean square error between the pixel values of the original image and the transmitted image) they fail to take into account the fact that transmission errors (e.g. quantization noise) in "busy" regions of the image are less visible than in "quiet" regions due to the phenomenon of frequency masking. U.S. Pat. No. 4,268,861 issued May 19, 1981, to Schreiber et al. is an example of a non block transform image coding process that takes the frequency masking phenomenon into account. In the image coding system described by Schreiber et al., the image signal is separated into low, middle, and high frequency components. The low frequency component is finely quantized, and the high frequency component is coarsely quantized. Since the high frequency component contributes to image detail areas, the noise from the coarse quantization is hopefully less visible in such areas.

It is the object of the present invention to provide a block transform image compression technique that produces a further compression of the digital image. It is a further object of the present invention to provide a block transform image compression technique that takes advantage of the phenomenon of frequency masking, wherein noise is less visible in regions of an image having high frequency detail.

DISCLOSURE OF THE INVENTION

The objects of the present invention are achieved in a block transform image compression technique by accounting in the quantization step for the fact that the human visual system is less sensitive to noise in the presence of image detail. Accordingly, in a method or apparatus for coding and transmitting a digital image over a limited bandwidth communication channel, in a transmitter or transceiver, a two dimensional spatial frequency block transformation is performed on a digital image to produce blocks of transform coefficients. The transform coefficients are quantized in accordance with a model of the visibility of quantization noise in the presence of image detail. The quantized transform coefficients are encoded and transmitted. In the preferred mode of practicing the invention, the transform coefficients are quantized by arranging the coefficients from a block into a one dimensional vector in order of increasing spatial frequency. The coefficients in the vector are sequentially quantized starting with the coefficient representing the lowest frequency, by forming an estimate of the contrast of the image structure in the block from the previous coefficients in the vector, and determining the quantization for the current coefficient as a function of the contrast estimate. The function relates the contrast estimate to the visibility of quantization error in the presence of image detail having such contrast.

According to a further aspect of the present invention, image detail characterized by an edge separating uniform areas in a block is detected, and the quantization based upon contrast is disabled when such an edge is detected, thereby improving the performance of the technique. In the preferred implementation of the invention, the adaptive quantization is implemented by adaptive normalization followed by fixed quantization.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing a system for compressing and transmitting digital images according to the present invention;

FIG. 2 is a block diagram showing further details of the determination of normalization factors in FIG. 1;

FIG. 3 is a graph useful in describing the concept of visual masking;

FIG. 4 is a graph of the values stored in the look up table shown in FIG. 2;

FIG. 5 is a block diagram showing further details of the recovery of normalization factors in the receiver shown in FIG. 1;

FIG. 6 is a block diagram showing how the block adaptive normalization according to the present invention is combined with global visual response normalization in the transmitter;

FIG. 7 is a block diagram showing how a receiver denormalizes the coefficients generated according to the process shown in FIG. 6;

FIG. 8 is a schematic diagram of a communication system employing transceivers useful according to the present invention;

FIG. 9 is a set of graphs showing the processed values resulting from processing an image block having a low image activity according to the present invention;

FIG. 10 is a set of graphs similar to those of FIG. 9, showing a block having high image activity; and

FIG. 11 is a set of graphs similar to those shown in FIGS. 9 and 10 showing a block having a high contrast edge.

MODES OF CARRYING OUT THE INVENTION

Before describing the practice of the invention, it will be helpful to discuss the nature of the artifacts caused by DCT processing and how they arise. In DCT compression schemes of the type employing visually weighted quantization, the bit rate is reduced by effectively increasing the quantization intervals for the DCT coefficients until the quantization noise is just below the threshold of visibility. In practice, the actual quantization step remains constant for all coefficients but is effectively varied by a preceding normalization step which divides the coefficients by some number, referred to as the normalization factor. The result of the normalization step is then quantized, typically by rounding to the nearest integer. A higher normalization factor will result in a lower input range to the fixed quantizer, which in turn results in fewer output quantization levels. Fewer quantization levels over the coefficient's dynamic range will result in larger quantization intervals after an inverse normalization step has been performed at the receiver. The inverse normalization step is merely a multiplication by the same normalization value employed at the transmitter. Compression results from the subsequent use of Huffman coding for amplitudes of the coefficients which have reduced values due to the normalization process, and run-length coding for the coefficients which are quantized to zero.
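The normalize, round, and denormalize chain described above can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the function names and sample values are assumptions, and rounding halves upward is chosen only to make the tie rule definite.

```python
import math

def quantize(coeff, norm_factor):
    """Normalize a coefficient, then round to the nearest integer level."""
    # floor(x + 0.5) rounds halves upward; the text only calls for
    # "rounding to the nearest integer", so this tie rule is an assumption.
    return math.floor(coeff / norm_factor + 0.5)

def dequantize(level, norm_factor):
    """Inverse normalization at the receiver: multiply by the same factor."""
    return level * norm_factor

coeff = 37.0  # hypothetical DCT coefficient value
for n in (1.0, 8.0):
    level = quantize(coeff, n)
    # A larger factor yields a smaller level (cheaper to code) but a
    # reconstruction error of up to half the factor.
    print(n, level, dequantize(level, n))
```

With a factor of 8 the effective quantization interval is 8 code values, so the reconstruction error is bounded by 4.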

Errors from the quantization process arise when the DCT coefficients are rounded to either the nearest higher quantization level or the nearest lower quantization level. The values of the DCT coefficients basically represent the amplitudes of spatial frequency components of an image block, wherein the absolute value is proportional to image contrast, while the sign of the coefficient determines the phase. Thus, the rounding process in the quantization step results in the possibility that a spatial frequency component may have an incorrect contrast. The quantization process employed in the following description will be rounding to the nearest quantization level, although other types of rounding, such as truncation (or rounding down), may be employed.

Nearest level rounding can produce several results. If the nearest level happens to be lower than the coefficient's original value, the spatial frequency component represented by the coefficient will have a reduced contrast. If the quantized value is higher than the original value, the spatial frequency component will appear with a higher contrast. With nearest level rounding type quantization, the maximum error is bounded by half the quantization interval. If the error is large enough, the spatial frequency component becomes clearly visible, appearing to be superimposed over the original image block. When many coefficient values are incorrect, the appearance of the errors approaches that of white noise.

The quantization errors in the DCT coefficient values result in spatial frequency components having either too high or too low a contrast, with the maximum contrast error bounded by one half the width of the quantization interval. The most straightforward way of applying human visual data to the quantization process is to use the spatial frequency contrast sensitivity function (CSF) as described in the Nill article noted above. The CSF is derived by taking the inverse of the visual contrast threshold, which describes the contrast at which a particular spatial frequency becomes detectable. Using the CSF the effective quantization interval is allowed to be as large as possible without resulting in the visibility of quantization error. In implementing such a scheme, the normalization value for a DCT coefficient is made proportional to the inverse of the contrast sensitivity for the spatial frequency represented by the DCT coefficient. It is advantageous to perform these calculations in a nonlinear visual space for amplitude of the code values of the image. Psycho-physical research indicates that this space is very close to a one third power function of display intensity for average viewing conditions.

The visual weighting of the coefficients as described above is valid for situations which are consistent with the experiments employed to measure the CSF, which also happens to be the most critical viewing conditions: i.e. when the spatial frequency component error appears in an otherwise uniform field. However, the human visual contrast sensitivity to different spatial frequencies in the presence of image structure is much less than that in the presence of a uniform field. This property is referred to as visual masking and is utilized in the present invention to improve the image quality or reduce the bit rate in a block transform image compression technique. Since the spatial frequency contrast errors are occurring in the presence of the original image, their visibility is masked by the inherent image structure. Thus, the quantization errors can be allowed to be larger than that ascertained merely from the CSF, and if performed correctly no new visible errors will be introduced, yet the bit rate can be reduced due to the larger quantization intervals.

The dependence of the visibility of image structure on the presence of noise is well studied. FIG. 3 is a graph of psycho-physical experimental data showing the effect on the threshold visibility of a single spatial frequency in the presence of white noise, plotted on a log-log scale. The ordinate of FIG. 3 is the log threshold contrast for visibility of the spatial frequency, and the abscissa of FIG. 3 is the log RMS contrast of the noise. As shown in FIG. 3, the threshold contrast T_a for visibility of a spatial frequency is not substantially affected until the noise contrast reaches a critical value N_crit, above which the effect of noise on the visibility threshold is essentially a straight line having a slope of one in log-log space.
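The shape of the curve in FIG. 3 can be written as a simple piecewise model. The sketch below is an idealization under stated assumptions: the function name is hypothetical, the transition at the critical noise contrast is made perfectly sharp, and the numeric values are illustrative rather than measured data.

```python
def visibility_threshold(noise_contrast, t_a, n_crit):
    """Idealized threshold contrast of a spatial frequency seen against noise."""
    # Below the critical noise contrast the threshold stays at t_a.
    # Above it the threshold is proportional to the noise contrast,
    # which is a straight line of slope one on log-log axes.
    return t_a * max(1.0, noise_contrast / n_crit)

# Hypothetical values: threshold 0.01, critical noise contrast 0.02.
for noise in (0.005, 0.02, 0.2):
    print(noise, visibility_threshold(noise, t_a=0.01, n_crit=0.02))
```

A tenfold increase in noise contrast above the critical value raises the threshold tenfold, which is the property the adaptive normalization relies on.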

This general linear relationship has been found to hold for all spatial frequencies in the presence of noise, although the threshold contrast T_a and critical noise value N_crit vary somewhat as functions of spatial frequency.

The results from other psycho-physical experiments on the effects of low pass noise having a pass band with a cut off less than the spatial frequency under consideration and high pass noise having a pass band with a cut off higher than the spatial frequency under consideration on the visibility of spatial frequencies show that the masking effect increases as the cut off frequency of the pass band of the noise approaches the spatial frequency under consideration, and a maximum occurs when the cut off frequency of the noise is equal to the spatial frequency for which the visibility threshold is being measured. At this point, the visibility of the spatial frequency in the presence of low pass or high pass noise depends upon the magnitude of the noise in the same manner as shown in FIG. 3. The experiments also show that the effects on visibility of a spatial frequency are greater in the presence of low pass noise than in the presence of high pass noise.

By reversing the roles of signal and noise in the above description, it can be appreciated how the visibility of quantization noise in an image is masked by the presence of image detail. The present invention takes advantage of this fact in a block transform digital image coding scheme to significantly improve the amount of compression achievable.

A block diagram of a system for compressing and transmitting a digital image according to the present invention is shown in FIG. 1. A transmitter 10 acquires a digital image from a source (not shown) such as an image sensor, film scanner or a digital image recorder. The digital image comprises for example 512×512 8-bit pixels. The transmitter 10 compresses and encodes the digital image, and supplies the encoded digital image signal to a limited bandwidth communication channel 12 such as a standard 3.3 kHz bandwidth telephone line. The encoded digital image signal is received from the channel 12 by a receiver 14 that decodes the compressed digital image signal and reconstructs the digital image.

Transmitter 10

The transmitter 10 receives the digital image I and formats (16) the image into blocks I(x,y). The currently preferred block size is 16×16 pixels. A two-dimensional discrete cosine transform is performed (18) on each block to generate the corresponding block T(i,j) of transform (2-D DCT) coefficients. Since the 2-D DCT is a well known procedure (see above referenced U.S. Pat. No. 4,302,775), no further description will be given herein of the 2-D DCT operation. The transform coefficients T(i,j) for each block are ordered (20) into a one-dimensional array T(k) in order of increasing spatial frequency, for example by employing a zig-zag scan along diagonals of the block of coefficients.
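One way to realize the ordering step (20) is a zig-zag traversal of the anti-diagonals, sketched below. The function names are hypothetical, and the direction in which each diagonal is walked is an assumption, since the text only calls for a scan along diagonals in order of generally increasing spatial frequency.

```python
def zigzag_order(n):
    """Return the (i, j) index pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):          # s = i + j indexes one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:                  # alternate direction on each diagonal
            diag.reverse()
        order.extend(diag)
    return order

def scan_block(block):
    """Flatten a square block T(i,j) into the one-dimensional array T(k)."""
    return [block[i][j] for i, j in zigzag_order(len(block))]

# A 3x3 block whose entries are numbered in the order the scan visits them.
block = [[0, 1, 5],
         [2, 4, 6],
         [3, 7, 8]]
print(scan_block(block))
```

For the preferred 16×16 block, `zigzag_order(16)` yields 256 index pairs beginning with the lowest frequency term at (0, 0).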

Next, the coefficients are adaptively quantized (22) in accordance with the visibility of quantization noise in the presence of image detail within a block. According to the preferred mode of practicing the invention, the adaptive quantization (22) is accomplished by variable normalization (24) prior to a fixed quantization (26). Alternatively, a variable quantization could be employed. The transform coefficients T(k) are normalized by dividing each transform coefficient by a normalization factor N(k) as follows

    TN(k)=T(k)/N(k)                                            (1)

where TN(k) is the normalized transform coefficient value. The normalization factor N(k) is determined (28) as described below based on the visibility of quantization noise in the presence of image detail in the block. The normalized coefficients TN(k) are quantized (26) to form quantized coefficients T̂N(k). The quantized coefficients are encoded (30) using a minimum redundancy coding scheme to produce code values CV(k). A presently preferred coding scheme is a Huffman code with run-length coding for strings of zero magnitude coefficients. Since Huffman and run-length coding are well known in the art (see above referenced U.S. Pat. No. 4,302,775), no further description of the coding process will be given herein. The coded coefficients are transmitted over the channel 12 to receiver 14.
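The run-length part of the coding step (30) can be illustrated with a short sketch that collapses strings of zero magnitude coefficients into run symbols before any Huffman table is applied. The symbol names "AMP" and "ZRUN" and the tuple format are assumptions made for this illustration; the text does not specify the symbol alphabet.

```python
def run_length_zeros(levels):
    """Group a sequence of quantized levels into amplitude and zero-run symbols."""
    symbols = []
    run = 0
    for value in levels:
        if value == 0:
            run += 1                          # extend the current run of zeros
        else:
            if run:
                symbols.append(("ZRUN", run))  # close the pending zero run
                run = 0
            symbols.append(("AMP", value))
    if run:
        symbols.append(("ZRUN", run))          # trailing zeros at end of block
    return symbols

print(run_length_zeros([5, 0, 0, -1, 0, 0, 0]))
```

Coarser normalization at high frequencies produces longer zero runs, which is where most of the bit rate saving would come from once the symbols are Huffman coded.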

Receiver 14

The receiver 14 performs the inverse of the operations performed by the transmitter 10 to recover the digital image. The code values CV(k) are decoded (32) to produce normalized coefficients T̂N(k). The normalized coefficients T̂N(k) are denormalized (34) employing denormalization values N⁻¹(k) that are the inverse of the normalization array N(k) employed in the transmitter to produce the denormalized coefficients T̂(k). Alternatively, the transform coefficients are denormalized by multiplying by the normalization coefficients N(k). The denormalization values N⁻¹(k) are recovered (36) at the receiver from the coefficient values as described in more detail below.

The one-dimensional string of reconstructed coefficient values T̂(k) is re-formatted (38) into two-dimensional blocks T̂(i,j) and the blocks of coefficients are inversely transformed (40) into image values Î(x,y). Finally, the blocks of image values are re-formatted (42) into the digital image Î.

Determination of Normalization Factors (28)

Since the transform coefficients T(k) to be normalized are arranged in increasing order of spatial frequency, in a sequential processing scheme, information about the values of all the previous coefficients, which represent lower spatial frequencies, is available when processing any given coefficient in the one dimensional array. The image detail represented by the previous coefficients is the low pass image detail. In analogy to the results of the psycho-physical experiments noted above, the quantization noise in coefficient T(k) represents the signal, and the previous coefficients T(0)→T(k-1) represent the low pass noise (image detail) masking the visibility of the quantization noise.

The RMS contrast of the low pass image detail c_rms is represented by:

    c_{rms} = \sqrt{\sum_{i=0}^{k-1} T(i)^2}                    (2)

The amplitude of this rms contrast will determine the visibility threshold of the quantization error for quantized coefficient T̂N(k).

Based on typical display conditions (1.0 m viewing distance and a pixel spacing of 0.54 mm/pixel) the 16×16 pixel subimage blocks will subtend a 0.5 by 0.5 degree visual field. It is believed, as the result of experiment, that the masking effect does not extend uniformly over such a wide visual field. In fact, at as little as 0.4 degrees away from the site of image detail, the masking effect may be less than half the amount at the detail site. This impacts the determination of the normalization factors in that, when the detail in a subimage is not homogeneous, the masking factor determined from the previous coefficients may not be appropriate. An example would be a subimage block containing two uniform areas of widely differing grey level. The high contrast edge produced by this discontinuity between grey levels will result in relatively high amplitude lower frequency transform coefficients for the block. The values of these coefficients would indicate a large amount of image detail which would result in very coarse quantization of the higher frequency DCT coefficients. This coarse quantization will result in the presence of significant quantization error, which would be predicted to be masked by the image detail. However, quantization errors in the two smooth areas are not entirely masked by the presence of the edge, due to the limited local extent of the masking effect. A similar problem occurs in blocks containing an area of image texture and a smooth area for the reason noted above. Observations of compressed and decompressed images using the adaptive quantization technique described above indicate that the extent of the masking effect is substantially less than 0.5°.
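The 0.5 degree figure quoted for the display conditions above follows from simple geometry, as the short check below shows. The function name is hypothetical; the pixel count, pixel spacing, and viewing distance are the values given in the text.

```python
import math

def subtended_angle_deg(pixels, pitch_mm, distance_mm):
    """Visual angle, in degrees, subtended by a row of pixels at a distance."""
    width_mm = pixels * pitch_mm
    return math.degrees(2.0 * math.atan(width_mm / (2.0 * distance_mm)))

# 16 pixels at 0.54 mm/pixel viewed from 1.0 m: 8.64 mm wide, about 0.495 degree.
print(subtended_angle_deg(16, 0.54, 1000.0))
```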

To avoid problems caused by sharp edges between uniform areas, adaptive normalization is not practiced on the first m (e.g. 10) coefficients in the block. The normalization factor for these coefficients is set to a predetermined value (e.g. 1) and the summation process is started at the (m+1)st coefficient. The detail estimate c_rms is started with the mth coefficient value, such that:

    c_{rms} = \sqrt{\sum_{i=m}^{k-1} \hat{T}(i)^2}              (3)

where T̂(i) is the denormalized quantized value of coefficient i. Similarly, in recovering the coefficients at the receiver, the first m coefficients are denormalized with the predetermined constant, and the detail estimate c_rms is begun at the mth coefficient value.

Since an edge produces energy in the transform coefficients in approximate proportion to the inverse of the spatial frequency represented by the coefficient, for very high contrast edges the high frequency coefficients may still contribute sufficient value to the detail estimate c_rms to produce an incorrectly calculated masking effect. According to a further refinement of the present invention, this situation is accounted for by employing an edge detector prior to determining the normalization factor, and disabling the adaptive normalization when an edge is detected. A simple edge detector is implemented by summing the absolute values of the first m coefficients and comparing the sum to a predetermined value C₁ to determine whether an edge is present. When a high contrast edge is detected, the adaptive normalization is disabled for the block by setting all of the normalization factors equal to one.
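The simple edge detector just described amounts to one sum and one comparison. In the sketch below the function name and the sample numbers are hypothetical; the threshold C₁ would be chosen empirically, and no value for it is given in the text.

```python
def edge_detected(coeffs, m, c1):
    """Return True when the first m coefficients indicate a high contrast edge."""
    d_e = sum(abs(value) for value in coeffs[:m])   # edge detect value
    return d_e > c1                                 # compare with threshold C1

# Hypothetical blocks with m = 3 and C1 = 200.
print(edge_detected([100, -80, 60, 1, 0], m=3, c1=200))   # strong low frequencies
print(edge_detected([3, -2, 1, 40, 50], m=3, c1=200))     # weak low frequencies
```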

A more sophisticated edge detector may be implemented at the cost of increased computational complexity by calculating the ratio of the variance of the low frequencies in the image block to the variance of all the frequencies in the block. A high ratio will indicate the presence of a high contrast edge. The calculation of the variance ratio may occur in parallel with the calculation of the DCT, and the results employed when determining the normalization factors.

Referring now to FIG. 2, the steps involved in determining the normalization factors N(k) will be described in more detail. The normalization factors for the first m coefficients are set equal to one (44). An edge detect value D_e is computed (according to the simple method described above) by summing the absolute values of the first m coefficients (46). The value of D_e is compared to the predetermined threshold C₁ (48). If the edge detect value is greater than the threshold C₁, an edge has been detected, and further adaptive normalization is disabled by setting the remainder of the normalization factors equal to one (50).

If the edge detect value is less than or equal to the predetermined constant C₁, subsequent coefficients are denormalized (52). This is done to enable the normalization factors to be recovered at the receiver without error. Using the normalized quantized coefficients at the transmitter to determine the normalization factors ensures that the values later employed at the receiver will be identical. The coefficients are processed sequentially, and the coefficient T̂N(k-1) to be denormalized is held over from the previous processing cycle by delay (54). A detail estimate c_rms is computed (56) according to equation (3) above. The detail estimate c_rms is employed (58) to address a look up table (60) that contains the normalization factor values N(k). The normalization factor N(k) is employed to normalize the coefficient T(k) prior to quantization and will be employed in the next cycle to denormalize the quantized coefficient T̂N(k). The normalization factors stored in look up table (60) are generated empirically from the relationship shown in FIG. 3.
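The sequential procedure of FIG. 2 can be sketched as a single loop in which each quantized coefficient is denormalized and fed back into the detail estimate for the next one. This is a minimal sketch under several assumptions: `example_lookup` is a hypothetical stand-in for look up table (60), the function names and test values are invented, and the edge detect value is formed from the quantized first m coefficients so that a receiver could reproduce it exactly.

```python
import math

def example_lookup(detail):
    """Hypothetical table: factor 1 for low detail, then proportional, capped at 16."""
    return min(max(1.0, detail / 4.0), 16.0)

def adaptive_normalize(coeffs, m, c1, lookup):
    """Return (quantized levels, normalization factors) for one block T(k)."""
    # The first m coefficients use a normalization factor of one.
    levels = [math.floor(t + 0.5) for t in coeffs[:m]]
    factors = [1.0] * len(levels)
    # Edge detect value: sum of absolute values of the first m coefficients.
    edge = sum(abs(q) for q in levels) > c1
    sum_sq = 0.0                       # running sum of squares from index m
    for t in coeffs[m:]:
        n = 1.0 if edge else lookup(math.sqrt(sum_sq))
        q = math.floor(t / n + 0.5)    # normalize, then fixed quantization
        levels.append(q)
        factors.append(n)
        # Denormalize the quantized value and add it to the detail estimate,
        # so the next factor depends only on data the receiver also has.
        sum_sq += (q * n) ** 2
    return levels, factors

levels, factors = adaptive_normalize([3.2, -1.9, 8.3, 4.1, 1.0], m=2, c1=100,
                                     lookup=example_lookup)
print(levels, factors)
```

Feeding back the denormalized quantized values, rather than the original coefficients, is what lets the factors be recovered without side information.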

Although the square root of the sum of the squares of the coefficient values is the preferred estimate of image detail for selecting the normalization factors, the square and square root operations are computationally intensive. To provide a more computationally efficient process that can be accomplished in less time by less sophisticated hardware (e.g. a microprocessor) with only a slight reduction in compression efficiency, the sum of the absolute values of the denormalized quantized coefficients may be employed as the detail estimate c_rms. This alternative method of forming the detail estimate is shown in dashed box (56') in FIG. 2.
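The two forms of the detail estimate can be compared side by side. The function names and sample values below are hypothetical; the point of the sketch is only that the absolute-value form needs additions alone, and that it is never smaller than the root-sum-square form for the same inputs.

```python
import math

def detail_root_sum_squares(values):
    """Preferred estimate: square root of the sum of the squares."""
    return math.sqrt(sum(v * v for v in values))

def detail_sum_abs(values):
    """Cheaper estimate: sum of the absolute values (no multiply or root)."""
    return sum(abs(v) for v in values)

previous = [3.0, -4.0]   # hypothetical denormalized quantized coefficients
print(detail_root_sum_squares(previous), detail_sum_abs(previous))
```

Because the cheaper estimate runs larger, a look up table addressed by it would presumably be tuned separately, which is consistent with FIG. 4 being plotted against the sum of absolute values.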

FIG. 4 shows a plot of the actual values used as normalization factors N(k) for coefficients T(k) versus the sum of the absolute values of the coefficients from m to k-1, where m is 10. The lower flat portion 58 of the curve in FIG. 4 reflects the threshold portion of the curve in FIG. 3. The upper flat portion 60 of the curve in FIG. 4 is imposed by the limited number of bits (e.g. 10) used in the code word to define the normalization factor N(k). The slope of one in the sloping portion (62) of the curve matches the slope of the curve in FIG. 3. The required dynamic range of the sloping portion 62 of the curve in FIG. 4 was determined empirically by observing the effects of compression and decompression on an assortment of digitized photographic images. A dynamic range was chosen consistent with the number of bits in the code word to produce maximum image compression without introducing visible quantization noise in the processed image.
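A table with the three regions described for FIG. 4 could be generated as sketched below. Every number here is an assumption: the knee position, the table size, and the use of 2^bits - 1 as the ceiling are illustrative, and the actual table values were determined empirically. The sloping region is modelled as proportional to the detail estimate, which is what a slope of one on log-log axes implies.

```python
def build_lut(size, knee, bits=10):
    """Hypothetical normalization table indexed by an integer detail estimate."""
    ceiling = float(2 ** bits - 1)      # upper flat portion: code word limit
    table = []
    for detail in range(size):
        factor = detail / knee          # sloping portion: proportional growth
        factor = max(1.0, factor)       # lower flat portion: threshold region
        table.append(min(factor, ceiling))
    return table

lut = build_lut(size=5000, knee=4)
print(lut[0], lut[6], lut[4999])
```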

Recovery of the Denormalization Factors (36)

The recovery of the denormalization factors N⁻¹(k) at the receiver duplicates the process of their generation at the transmitter, and will now be described with reference to FIG. 5. An edge detect value D_e is computed (64) by summing the absolute values of the first m coefficients. The edge detect value is compared with the predetermined threshold C₁ (66) to determine if an edge is present in the subportion of the image. If an edge is detected, all of the denormalization factors are set equal to one (68). If an edge is not detected, denormalization factors are determined for subsequent coefficients by forming a detail estimate (72) c_rms for each coefficient. The detail estimate is the square root of the sum of the squares of previous denormalized coefficient values from the mth coefficient value to the immediately previous value (the (k-1)st). A running sum is accumulated and the previous denormalized value is supplied via a one cycle delay (74). The detail estimate is employed to address (76) a look up table (78) that contains the denormalization factors, which are the reciprocals of the normalization factors employed in the transmitter 10. Alternatively, the values stored in look up table (78) may be identical to values stored in the look up table (60) in the transmitter, and the denormalization may be implemented by multiplying by the normalization factors.
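The receiver-side recovery of FIG. 5 can be sketched as a loop that rebuilds each factor from the values already denormalized. The sketch follows the alternative in which the table holds normalization factors and denormalization is a multiplication; the function names, the lambda used as a table, and the sample levels are all hypothetical.

```python
import math

def recover_and_denormalize(levels, m, c1, lookup):
    """Rebuild the denormalized coefficients of one block from its quantized levels."""
    recon = [float(q) for q in levels[:m]]          # first m: factor of one
    edge = sum(abs(v) for v in recon) > c1          # edge detect value vs C1
    sum_sq = 0.0                                    # running sum of squares
    for q in levels[m:]:
        n = 1.0 if edge else lookup(math.sqrt(sum_sq))
        value = q * n                               # multiply by the factor
        recon.append(value)
        sum_sq += value * value                     # feeds the next cycle
    return recon

table = lambda detail: min(max(1.0, detail / 4.0), 16.0)   # hypothetical table
print(recover_and_denormalize([3, -2, 8, 2, 0], m=2, c1=100, lookup=table))
```

No normalization factor is transmitted: each one is derived from coefficients that have already been decoded.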

Of course, if the more computationally efficient procedure using the sum of the absolute values of the coefficients to compute the detail estimate is employed in the transmitter, the detail estimate will be likewise formed at the receiver, as shown in dashed block 72' in FIG. 5.

The block adaptive transform coding scheme according to the present invention can also be combined with a global visual weighting quantization scheme to produce even further improvements in compression ratio. In a global visually weighted quantization scheme, a global normalization array representing the relative human visual response to each of the spatial frequencies represented by the corresponding DCT coefficients is applied to all the blocks of the image.

An improvement to this visual weighting technique, wherein the reduced human visual response to diagonally oriented spatial frequencies is taken into account, is disclosed in copending patent application Ser. No. 057,413 entitled "Digital Image Compression and Transmitting System Employing Visually Weighted Transform Coefficient Normalization" by the present inventors, filed on even date herewith.

FIG. 6 illustrates the manner in which the human visual weighting technique is combined with the adaptive normalization technique in the transmitter. A global normalization array 80 contains normalization factors representing the relative human visual response to the spatial frequencies represented by the DCT coefficients. The local normalization factors based upon the image detail in the block are determined (28) as described above. The global normalization factor G(k) for the kth coefficient is multiplied (82) by the local normalization factor N(k), to produce the final normalization factor to normalize (24) the coefficient T(k). The only change to the details of the computation of the local normalization factor N(k) as described in FIG. 2 involves employing the final normalization factor G(k)×N(k) in the denormalization (52) of the coefficient. This slight change is indicated by dashed lines in FIG. 2.
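For a single coefficient, the combination of the global and local factors can be sketched as follows. The function name and numeric values are hypothetical; the only behavior taken from the text is that the product G(k)×N(k) is used both to normalize and, in the feedback path, to denormalize.

```python
import math

def normalize_combined(t_k, g_k, n_k):
    """Quantize T(k) with the final factor G(k) x N(k).

    Returns the quantized level and its denormalized value, which is what
    would be accumulated into the detail estimate for later coefficients.
    """
    final = g_k * n_k                        # multiplier (82)
    level = math.floor(t_k / final + 0.5)    # normalize (24), then quantize
    return level, level * final              # denormalize (52) with G x N

print(normalize_combined(37.0, g_k=2.0, n_k=4.0))
print(normalize_combined(37.0, g_k=2.0, n_k=1.0))
```

When the local factor is one, as in a low activity block, the result reduces to plain global visual weighting.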

The recovery of the final normalization factor at the receiver is shown in FIG. 7. After the local denormalization factor N⁻¹(k) is determined (36) it is multiplied (81) by a global denormalization factor G⁻¹(k) from a global denormalization array (83). The global denormalization array values are the reciprocals of the global normalization values. No modification to the details for determining the local denormalization factors as shown in FIG. 5 is required.

Working Example

Referring now to FIG. 8, a preferred implementation of the present invention in a still video communication system will be described. The system includes two or more transceivers 84 coupled to a telephone transmission line 86. Each of the transceivers 84 is connected to a video signal source such as a video camera 88, and to a video display such as a video monitor 90. Each transceiver 84 contains a standard video interface 92 that receives video signals from the video source, digitizes the signals, and supplies the digital image signals to a digital frame store 94. The video interface 92 also receives digital image signals from the digital frame store 94 and produces a standard video signal for display on the video monitor 90.

Each transceiver is controlled by an Intel 80186 microprocessor 96 having conventional ROM 98 and RAM 100 for storing the control programs and temporary storage of data respectively. The microprocessor 96 performs the run-length and Huffman coding and decoding, and the block adaptive normalization and denormalization on the DCT coefficients. The coded DCT coefficients are sent and received over a telephone line 86 via an R96 FT/SC modem 102. The forward discrete cosine transforms (DCT) (in the transmitting mode) and reverse transforms (in the receiving mode) are performed by a TMS 32020 Digital Signal Processor 104 having a conventional RAM 105 for storing the DCT transform program.

In the transmitting mode, the microprocessor 96 retrieves one 16×16 block of digital image data at a time from an image buffer 106 in the digital frame store 94. The 16×16 block of digital image data is temporarily stored in a dual port SRAM 108 that is accessible by both the microprocessor 96 and the digital signal processor 104. The digital signal processor 104 performs the discrete cosine transform and returns the 16×16 block of transform coefficients to the dual port SRAM 108. The block of transform coefficients is then normalized and compressed (Huffman and run-length encoded) by the microprocessor 96. The compressed signal is stored in a compressed image buffer 110 in digital frame store 94 and transmitted at the data rate of the modem 102. This cycle is repeated on each block until the entire image has been compressed and transmitted.
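The block-by-block cycle in the transmitting mode can be sketched in software as follows. This is only an illustration under stated assumptions: an orthonormal DCT-II stands in for the transform program run on the digital signal processor, a single uniform step size stands in for the normalization arrays, and the Huffman and run-length stages are omitted. All function names and the `step` parameter are hypothetical.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(v)
    out = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            v[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
            for x in range(n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def iter_blocks(image, size=16):
    """Yield size x size blocks in raster order.

    Assumption: image dimensions are multiples of `size`.
    """
    for by in range(0, len(image), size):
        for bx in range(0, len(image[0]), size):
            yield [row[bx:bx + size] for row in image[by:by + size]]


def compress_image(image, step=8.0, size=16):
    """Transform and quantize one block at a time."""
    compressed = []
    for block in iter_blocks(image, size):
        coeffs = dct_2d(block)                       # role of processor 104
        quantized = [[int(round(c / step)) for c in row]
                     for row in coeffs]              # role of microprocessor 96
        compressed.append(quantized)
    return compressed


# A flat 16 x 32 test image: two blocks, each with only a DC coefficient.
flat = [[10.0] * 32 for _ in range(16)]
result = compress_image(flat)
```

For a flat block, all the energy lands in the single DC coefficient, so every other quantized value is zero.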

In the receiving mode, a compressed digital image is received via modem 102 and stored in compressed image buffer 110. One block at a time of compressed DCT coefficients is retrieved from the compressed image buffer 110 and denormalized and expanded by microprocessor 96. The expanded block of DCT coefficients is supplied to dual port SRAM 108. The digital signal processor 104 inversely transforms the coefficients to produce a 16×16 block of digital image values, which are temporarily stored in SRAM 108. Microprocessor 96 transfers the block of digital image values from the dual port SRAM 108 to image buffer 106. This cycle is repeated until the entire image has been received, decompressed, and stored in image buffer 106. The image is displayed on the video monitor 90, via video interface 92, as it is received.
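The compression and expansion performed by the microprocessor include run-length coding of zero-valued coefficients. The sketch below shows one possible scheme, assumed here purely for illustration: each nonzero coefficient is sent as a (zero run, value) pair and a hypothetical end-of-block marker replaces the trailing zeros. The actual code tables are not given in this passage, and the Huffman stage is omitted.

```python
END_OF_BLOCK = "EOB"  # hypothetical marker: all remaining coefficients are zero


def rle_encode(coeffs):
    """Encode a coefficient vector as (zero_run, value) pairs plus a marker."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append(END_OF_BLOCK)
    return pairs


def rle_decode(pairs, length):
    """Expand (zero_run, value) pairs back into a vector of `length` values."""
    coeffs = []
    for item in pairs:
        if item == END_OF_BLOCK:
            break
        run, value = item
        coeffs.extend([0] * run)
        coeffs.append(value)
    coeffs.extend([0] * (length - len(coeffs)))  # restore trailing zeros
    return coeffs


vector = [20, 0, 0, -3, 1, 0, 0, 0]
encoded = rle_encode(vector)
decoded = rle_decode(encoded, len(vector))
```

Because the coefficients are ordered by increasing spatial frequency, zeros tend to cluster at the end of the vector, where the end-of-block marker absorbs them.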

The Digital Signal Processor 104 was programmed to implement the combination of block adaptive and global visual weighting described above. FIG. 9 shows the results of the processing steps for compression and reconstruction of a single 16×16 pixel image block having low amplitude detail. Although the image blocks were processed in linear array form, they are shown here as two-dimensional arrays to aid in visualization of the processing. Block A in FIG. 9 shows the input image values I(x,y). Block B shows the values of the transform coefficients T(x,y). Block C shows the local normalization factors N(x,y), which are all ones in this case due to the low amplitude of the image detail. Block D shows the final normalization factors G(x,y)×N(x,y) comprising the adaptively determined local normalization factors N(x,y) times the global human visual response normalization factors G(x,y). Because the local normalization factors are all ones, block D is simply the global normalization array. The global normalization array employed here takes into account the reduced response of the human visual system to diagonally oriented spatial frequencies, hence the appearance of the diagonal ridge 110 in the two-dimensional plot of the normalization values. Block E shows the quantized and denormalized coefficient values T(x,y) as recovered at the receiver. Block F shows the reconstructed image block I(x,y) at the receiver.

FIG. 10 is a plot similar to FIG. 9, showing the processing steps for compression and expansion of an image block A having high amplitude image detail. Comparing the local normalization factors of block C in FIG. 10 with the global normalization factors of block D in FIG. 9, it can be appreciated that the block adaptive normalization method of the present invention will provide significant further compression of the image.
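To illustrate why high amplitude detail yields larger local factors, and why the receiver can follow along without side information, here is a rough sketch. The contrast estimate (mean magnitude of the previously reconstructed AC coefficients), the `gain` constant, and the default of m = 2 fixed coefficients are all assumptions chosen for illustration. They are not the visibility model of the invention. The point shown is only that estimating contrast from already-quantized values lets the decoder recompute the same factors.

```python
def local_factor(previous, gain=0.05):
    """Hypothetical visibility model: the factor grows with contrast, minimum 1."""
    if not previous:
        return 1.0
    contrast = sum(abs(v) for v in previous) / len(previous)
    return max(1.0, gain * contrast)


def encode(T, m=2):
    """Sequentially quantize T; the first m coefficients use a factor of 1."""
    Q, recon = [], []
    for k, t in enumerate(T):
        # recon[1:] skips the DC term so it does not dominate the estimate.
        n = 1.0 if k < m else local_factor(recon[1:])
        q = int(round(t / n))
        Q.append(q)
        recon.append(q * n)   # identical to what the receiver reconstructs
    return Q


def decode(Q, m=2):
    """Recompute the same factors from the already-decoded coefficients."""
    recon = []
    for k, q in enumerate(Q):
        n = 1.0 if k < m else local_factor(recon[1:])
        recon.append(q * n)
    return recon


T = [800.0, 200.0, -150.0, 90.0, 40.0, -25.0]   # illustrative values only
Q = encode(T)
T_hat = decode(Q)
```

With these illustrative inputs the later coefficients are quantized with factors well above one, so they occupy far fewer levels than they would under a factor of one.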

FIG. 11 is a group of plots similar to FIGS. 9 and 10 showing an image block A having a high amplitude edge. The presence of the edge resulted in high amplitude, low frequency coefficients as seen in block B. The presence of the edge was detected, and the local normalization factors (block C) were all set equal to one.

A large variety of images were compressed and reconstructed according to the present invention. On the average, a 15 percent improvement in compression ratio was achieved by the block adaptive normalization technique.

Although the present invention has been described with reference to a monochromatic digital image, it will be readily apparent that the technique described can also be applied to a color digital image, for example by separating the image into a luminance component and chrominance component, and applying the block adaptive normalization technique to the luminance component. Generally, since the chrominance component is of lower resolution than the luminance component, the gains to be made by applying the more sophisticated compression techniques (of which the present invention is an example) to the chrominance component do not justify the added complexity. However, the present inventors have achieved excellent results in compressing color digital images by separating the images into a luminance component and two lower resolution chrominance components, and applying the block adaptive transform technique to the luminance component, and the DCT transform without the block adaptive feature to the chrominance components.
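The separation into one luminance component and two lower resolution chrominance components might look like the following sketch. The text does not say which color transform or subsampling ratio was used; the ITU-R BT.601 luma weights and the 2×2 averaging here are assumptions, and the function names are hypothetical.

```python
def rgb_to_ycc(r, g, b):
    """Convert one RGB pixel to luminance and two color-difference values.

    Assumption: BT.601 luma weights; the source does not name a transform.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr


def subsample(plane):
    """Halve the resolution by averaging each 2x2 neighbourhood.

    Assumption: the plane has even width and height.
    """
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]


def split_components(rgb_image):
    """Return full-resolution Y and half-resolution Cb and Cr planes."""
    converted = [[rgb_to_ycc(*pixel) for pixel in row] for row in rgb_image]
    Y = [[p[0] for p in row] for row in converted]
    Cb = [[p[1] for p in row] for row in converted]
    Cr = [[p[2] for p in row] for row in converted]
    return Y, subsample(Cb), subsample(Cr)


# A uniform gray 4 x 4 image: luminance carries the level, chrominance is zero.
gray = [[(100, 100, 100)] * 4 for _ in range(4)]
Y, Cb, Cr = split_components(gray)
```

The luminance plane would then go through the block adaptive path, while the two smaller chrominance planes would go through the DCT without the adaptive step.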

Industrial Applicability and Advantages

The present invention is useful in digital image transmission systems for transmitting a digital image over a narrow-band communication channel. The invention produces improved compression of the digital image without introducing visible artifacts, thereby enabling improvements in image quality for the same transmission time, or faster transmission times for the same quality of image, or allowing the use of narrower bandwidth communication channels for the same transmission time and image quality.

We claim:
 1. A transmitter for compressing and transmitting a digital image over a limited bandwidth communication channel, comprising: a. means for performing a two-dimensional spatial frequency block transformation on the digital image to produce blocks of transform coefficients; b. means for quantizing the transform coefficients in accordance with a model of the visibility of quantization error in the presence of image detail; c. means for encoding the quantized transformation coefficients with a minimum redundancy code; and d. means for transmitting the encoded transform coefficients.
 2. The transmitter claimed in claim 1, wherein said means for quantizing the transform coefficients comprises: a. means for normalizing the coefficients in accordance with a model of the visibility of quantization errors in the presence of image detail; and b. means for quantizing the normalized coefficients.
 3. The transmitter claimed in claim 1, further including means for globally quantizing the transform coefficients based on a model of the human visual response to the spatial frequencies represented by the transform coefficients.
 4. The transmitter claimed in claim 1, wherein said means for quantizing transform coefficients comprises: a. means for arranging the coefficients from a block into a one-dimensional array in order of increasing spatial frequency; and b. means for sequentially quantizing the coefficients in the array, starting with the coefficient representing the lowest frequency, including: (1) means for forming an estimate of the contrast of the image structure in the block from the previous coefficient values in the array; and (2) means for determining the quantization for the current coefficient as a function of the contrast estimate, said function relating the contrast estimate to the visibility of quantization error in the presence of image detail having such contrast.
 5. The transmitter claimed in claim 4, wherein the previous coefficients employed to form an estimate of the contrast of the image detail are the quantized coefficients, whereby the quantization values may be recovered from the quantized signal value at a receiver without error.
 6. The transmitter claimed in claim 4, wherein said means for sequentially quantizing provides a predetermined quantization for the first m coefficients.
 7. The transmitter claimed in claim 6 wherein said means for sequentially quantizing the coefficients includes means for detecting the presence of an edge separating uniform image areas in the block, and means for providing a predetermined quantization for all the coefficients in the array when such an edge is detected.
 8. A method for compressing a digital image for transmission over a limited bandwidth communication channel, comprising the steps of: a. performing a two-dimensional spatial frequency block transformation on the digital image to produce blocks of transform coefficients; b. quantizing the transform coefficients in accordance with a model of the visibility of quantization error in the presence of image detail; and c. encoding the quantized transformation coefficients employing a minimum redundancy code.
 9. The method claimed in claim 8, wherein said step of quantizing the transform coefficients comprises the steps of: a. normalizing the coefficients in accordance with a model of the visibility of quantization errors in the presence of image detail; and b. quantizing the normalized coefficients.
 10. The method claimed in claim 8, further including globally quantizing the transform coefficients based on a model of the human visual response to the spatial frequencies represented by the transform coefficients.
 11. The method claimed in claim 8, wherein said step of quantizing transform coefficients comprises the steps of: a. arranging the coefficients from a block into a one-dimensional array in order of increasing spatial frequency; and b. sequentially quantizing the coefficients in the array, starting with the coefficient representing the lowest frequency, including the steps of: (1) forming an estimate of the contrast of the image structure in the block from the previous coefficient values in the array; and (2) determining the quantization for the current coefficient as a function of the contrast estimate, said function relating the contrast estimate to the visibility of quantization error in the presence of image detail having such contrast.
 12. The method claimed in claim 11, wherein the previous coefficients employed to form an estimate of the contrast of the image detail are the quantized coefficients, whereby the quantization values may be recovered from the quantized signal value at a receiver without error.
 13. The method claimed in claim 11, wherein said step of sequentially quantizing quantizes the first m coefficients with a predetermined step size.
 14. The method claimed in claim 13 wherein said step of sequentially quantizing the coefficients includes detecting the presence of an edge separating uniform image areas in the block, and quantizing all the coefficients in the array with a predetermined step size when such an edge is detected.